# Golden Master / Baseline Drift: Deep Settings-Drift (Content-Fidelity) Analysis

> Enterprise Research Report for TenantAtlas / TenantPilot
> Date: 2025-07-15
> Scope: Architecture, code evidence, implementation proposal

---

## Table of Contents

1. [Executive Summary](#1-executive-summary)
2. [System Map: Side-by-Side Comparison](#2-system-map)
3. [Architecture Decision Record (ADR-001): Unify vs Separate](#3-adr-001)
4. [Deep-Dive: Why Settings Changes Don't Produce Baseline Drift](#4-deep-dive)
5. [Code Evidence Table](#5-code-evidence)
6. [Type Coverage Matrix](#6-type-coverage-matrix)
7. [Proposal: Deep Drift Implementation Plan](#7-deep-drift-plan)
8. [Test Plan (Enterprise)](#8-test-plan)
9. [Open Questions / Assumptions](#9-open-questions)
10. [Key Questions Answered (KQ-01 through KQ-06)](#10-key-questions)

---

## 1. Executive Summary

1. **Two parallel drift systems exist**: *Baseline Compare* (meta fidelity, inventory-sourced) and *Backup Drift* (content fidelity, PolicyVersion-sourced). They share `DriftHasher` and the `Finding` model but are otherwise separate data paths with separate finding generators.
2. **The core gap**: `CompareBaselineToTenantJob` hashes `InventoryMetaContract` v1, whose only change signals are `odata_type`, `etag`, `scope_tag_ids`, and `assignment_target_count`. It never hashes actual policy settings. When an admin changes a Wi-Fi password or a compliance threshold in Intune, _none of these meta signals necessarily change_.
3. **Inventory sync uses Graph LIST endpoints**, which return metadata and display fields only. Per-item GET (which fetches settings, assignments, scope tags) is only performed during _Backup_ via `PolicyCaptureOrchestrator`.
4. **`DriftFindingGenerator`** (the backup drift system) _does_ detect settings changes: it normalizes `PolicyVersion.snapshot` via `SettingsNormalizer` → `PolicyNormalizer::flattenForDiff()` → type-specific normalizers, then hashes with `DriftHasher`.
5. **Spec 116 already designs v2** with a provider precedence chain (`PolicyVersion → Inventory content → Meta fallback`), which is the correct architectural direction. The v1 meta baseline shipped first as a deliberate, safe-to-ship initial milestone.
6. **Unification is recommended** (provider chain approach): not by merging the two jobs, but by enabling `CompareBaselineToTenantJob` to optionally consume `PolicyVersion` snapshots as a content-fidelity provider, falling back to `InventoryMetaContract` when no PolicyVersion is available.
7. **28 supported policy types** are registered in `tenantpilot.php`, plus 3 foundation types. Of these, 10+ have complex hydration (settings catalog, group policy, security baselines, compliance actions) and would benefit most from deep-drift detection.
8. **The `etag` signal** is unreliable as a settings-change proxy: Microsoft Graph etag semantics vary per resource type, and etag may or may not change when settings are modified. It is useful as a _hint_ but not a _guarantee_.
9. **API cost is the primary constraint**: content-fidelity compare requires per-item GET calls (or a recent Backup that already captured PolicyVersions). The hybrid provider chain avoids this by opportunistically _reusing_ existing PolicyVersions without requiring a full backup before every compare.
10. **Coverage Guard is critical for v2**: the baseline system must know _which types have fresh PolicyVersions_ and suppress content-fidelity findings for types where no recent version exists (falling back to meta fidelity).
11. **Risk profile**: Shipping deep-drift for the wrong types (without proper per-type normalization) could produce false positives. Type-specific normalizers already exist for the backup drift path; reusing them limits this risk (see A-6 for the coverage still to be validated).
12. **Recommended phasing**: v1.5 (current sprint) = add `content_hash_source` column to `baseline_snapshot_items` + provider chain in compare job. v2.0 = on-demand per-item GET during baseline capture for types lacking recent PolicyVersions.

---

## 2. System Map

### Side-by-Side Comparison Table

| Dimension | System A: Baseline Compare | System B: Backup Drift |
|---|---|---|
| **Entry point** | `CompareBaselineToTenantJob` | `GenerateDriftFindingsJob` → `DriftFindingGenerator` |
| **Data source (current)** | `InventoryItem` (from LIST sync) | `PolicyVersion` (from per-item GET backup) |
| **Data source (baseline)** | `BaselineSnapshotItem` (captured from inventory) | Earlier `PolicyVersion` from prior `OperationRun` |
| **Hash contract** | `InventoryMetaContract` v1 → `DriftHasher::hashNormalized()` | `SettingsNormalizer` → `PolicyNormalizer::flattenForDiff()` → `DriftHasher::hashNormalized()` |
| **Hash inputs** | `version`, `policy_type`, `subject_external_id`, `odata_type`, `etag`, `scope_tag_ids`, `assignment_target_count` | Full `PolicyVersion.snapshot` JSON (with volatile key removal) |
| **Fidelity** | `meta` (persisted as `fidelity='meta'` in snapshot context) | `content` (settings + assignments + scope_tags) |
| **Dimensions detected** | `missing_policy`, `different_version`, `unexpected_policy` | `policy_snapshot` (added/removed/modified), `policy_assignments` (modified), `policy_scope_tags` (modified) |
| **Finding identity** | `recurrence_key = sha256(tenantId\|snapshotId\|policyType\|extId\|changeType)` | `recurrence_key = sha256(drift:tenantId:scopeKey:subjectType:extId:dimension)` |
| **Scope key** | `baseline_profile:{profileId}` | `DriftScopeKey::fromSelectionHash()` |
| **Auto-close** | `BaselineAutoCloseService` (stale finding resolution) | `resolveStaleDriftFindings()` within `DriftFindingGenerator` |
| **Coverage guard** | `InventoryCoverage::fromContext()` → uncovered types → partial outcome | None (trusts backup captured all types) |
| **Graph API calls** | Zero at compare time (reads from DB) | Zero at compare time (reads PolicyVersions from DB) |
| **Graph API calls (capture)** | Zero (inventory sync did LIST) | Per-item GET via `PolicyCaptureOrchestrator` |
| **Normalizer pipeline** | None (meta contract is the normalization) | `SettingsNormalizer` → `PolicyNormalizer` → type normalizers |
| **Shared components** | `DriftHasher`, `Finding` model | `DriftHasher`, `Finding` model |
| **Trigger** | After inventory sync, on schedule/manual | After backup, on schedule/manual |

### Data Flow Diagrams

```
SYSTEM A: Baseline Compare (Meta Fidelity)
============================================
Graph LIST ──► InventorySyncService ──► InventoryItem (meta_jsonb)
                                              │
                                              ▼
                                   CaptureBaselineSnapshotJob
                                   ├─ InventoryMetaContract.build()
                                   ├─ DriftHasher.hashNormalized()
                                   └─► BaselineSnapshotItem (baseline_hash)
                                              │
                                              ▼
                                   CompareBaselineToTenantJob
                                   ├─ loadCurrentInventory() → InventoryItem
                                   ├─ BaselineSnapshotIdentity.hashItemContent()
                                   │    └─ InventoryMetaContract.build()
                                   │    └─ DriftHasher.hashNormalized()
                                   ├─ computeDrift() → hash compare
                                   └─ upsertFindings() → Finding records

SYSTEM B: Backup Drift (Content Fidelity)
============================================
Graph GET ──► PolicySnapshotService.fetch() ──► full JSON snapshot
                                              │
                                              ▼
                                   PolicyCaptureOrchestrator.capture()
                                   ├─ assignments GET
                                   ├─ scope tags resolve
                                   └─► VersionService.captureVersion() ──► PolicyVersion
                                              │
                                              ▼
                                   DriftFindingGenerator.generate()
                                   ├─ versionForRun() → baseline/current PV
                                   ├─ SettingsNormalizer.normalizeForDiff()
                                   │    └─ PolicyNormalizer.flattenForDiff()
                                   ├─ DriftHasher.hashNormalized() × 3
                                   │    (snapshot, assignments, scope_tags)
                                   └─ upsertDriftFinding() → Finding records
```

---

## 3. ADR-001: Unify vs Separate

### Title

ADR-001: Golden Master Baseline Compare (Provider Chain for Content Fidelity)

### Status

PROPOSED

### Context

TenantPilot has two drift detection systems that evolved independently:

- **System A (Baseline Compare)**: Designed for the "does the tenant still match the golden master?" use case. Ships with meta fidelity (v1): fast, cheap, zero additional Graph calls at compare time. Detects structural drift (policy added/removed/meta-changed) but is blind to _settings_ changes.
- **System B (Backup Drift)**: Designed for the "what changed between two backup points?" use case. Content fidelity: full PolicyVersion snapshots with per-type normalization. Detects settings, assignments, and scope tag changes.

The two systems cannot be merged into one without fundamentally changing their triggering, scoping, and API cost models. However, System A's accuracy can be dramatically improved by _consuming_ data already produced by System B.

### Decision

**Adopt the Provider Chain pattern** as already designed in Spec 116 v2:

```
ContentProvider = PolicyVersion → InventoryContent → MetaFallback
```

Specifically:

1. `CompareBaselineToTenantJob` gains a `ContentProviderChain` that, for each `(policy_type, external_id)`:
   - **First**: Looks for a `PolicyVersion` captured since the last baseline snapshot timestamp. If found, normalizes via `SettingsNormalizer` → `DriftHasher` → returns `content` fidelity hash.
   - **Second (future)**: Looks for enriched inventory content if inventory sync is upgraded to capture settings (v2.0+).
   - **Fallback**: Builds `InventoryMetaContract` v1 → `DriftHasher` → returns `meta` fidelity hash.
2. Each baseline snapshot item records its `fidelity` (`meta` | `content`) and `content_hash_source` (`inventory_meta_v1` | `policy_version:{id}` | `inventory_content_v2`).
3. Compare findings carry `fidelity` in evidence, enabling the UI to display a confidence level.
4. Coverage Guard is extended: a type is `content-covered` only if PolicyVersions exist for ≥N% of items. Below that threshold, fall back to meta fidelity (do not suppress findings).

### Consequences

- **Positive**: No new Graph API calls needed (reuses existing PolicyVersions from backups). Zero additional infrastructure. Incremental rollout per policy type. Existing meta-fidelity behavior preserved as fallback.
- **Negative**: Content fidelity depends on backup recency. If a tenant hasn't been backed up, only meta fidelity is available. Could create "mixed fidelity" findings within a single compare run.
- **Rejected Alternative**: Full merge of System A and B into a single system. Rejected because they serve different use cases (golden master comparison vs point-in-time drift), have different scoping models (BaselineProfile vs selection_hash), and different triggering models (post-inventory-sync vs post-backup).
- **Rejected Alternative**: Always-GET during baseline compare. Rejected due to API cost (30+ types × 100s of policies = 1000s of GET calls per tenant per compare run).

### Compliance Notes

- Livewire v4.0+ / Filament v5: no UI changes in core ADR; provider chain is purely backend.
- Provider registration: n/a (backend services only).
- No destructive actions.
- Asset strategy: no new assets.

---

## 4. Deep-Dive: Why Settings Changes Don't Produce Baseline Drift

### The Root Cause Chain

**Step 1: Inventory Sync captures only LIST metadata**

`InventorySyncService::executeSelectionUnderLock()` (line ~340-450) calls Graph LIST endpoints. For each policy, it extracts:

- `display_name`, `category`, `platform` (display fields)
- `odata_type`, `etag`, `scope_tag_ids`, `assignment_target_count` (meta signals)

These are stored in `InventoryItem.meta_jsonb`. **No settings values are fetched or stored.**

**Step 2: Baseline Capture hashes only the Meta Contract**

`CaptureBaselineSnapshotJob::collectSnapshotItems()` reads from `InventoryItem`, then calls `BaselineSnapshotIdentity::hashItemContent()`:

```php
// BaselineSnapshotIdentity.php, line 56-67
public function hashItemContent(string $policyType, string $subjectExternalId, array $metaJsonb): string
{
    $contract = $this->metaContract->build(
        policyType: $policyType,
        subjectExternalId: $subjectExternalId,
        metaJsonb: $metaJsonb,
    );

    return $this->hasher->hashNormalized($contract);
}
```

The `InventoryMetaContract::build()` output is:

```php
[
    'version' => 1,
    'policy_type' => 'settingsCatalogPolicy',
    'subject_external_id' => '',
    'odata_type' => '#microsoft.graph.deviceManagementConfigurationPolicy',
    'etag' => '"abc..."', // ← unreliable change indicator
    'scope_tag_ids' => ['0'],
    'assignment_target_count' => 3,
]
```

**This is ALL that gets hashed.** Actual policy settings (the Wi-Fi password, the compliance threshold, the firewall rule) are _nowhere_ in this contract.

**Step 3: Baseline Compare re-computes the same meta hash**

`CompareBaselineToTenantJob::loadCurrentInventory()` (line 367-409) reads current `InventoryItem` records and calls the same `BaselineSnapshotIdentity::hashItemContent()` with the same `InventoryMetaContract`, producing the same hash structure.
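
The blind spot described in Steps 1-3 can be reproduced in isolation. The following is a minimal sketch, not the project's code: `sortKeysRecursive()`, `hashNormalized()` and `buildMetaContract()` are hypothetical stand-ins for `DriftHasher` and `InventoryMetaContract`, and the policy data (including the `wifi_psk` key) is made up. It assumes PHP 8.1+ for `array_is_list()` and assumes the etag did not change when the setting was edited.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for DriftHasher::hashNormalized(): sort keys
// recursively, JSON-encode, sha256. Volatile-key removal is omitted here.
function sortKeysRecursive(array $data): array
{
    foreach ($data as $key => $value) {
        if (is_array($value)) {
            $data[$key] = sortKeysRecursive($value);
        }
    }
    if (!array_is_list($data)) {
        ksort($data); // lists keep their order; maps get a stable key order
    }

    return $data;
}

function hashNormalized(array $normalized): string
{
    return hash('sha256', json_encode(sortKeysRecursive($normalized), JSON_THROW_ON_ERROR));
}

// Hypothetical stand-in for InventoryMetaContract::build(): only the seven
// contract fields are kept; anything else known about the policy is ignored.
function buildMetaContract(string $policyType, string $externalId, array $metaJsonb): array
{
    return [
        'version' => 1,
        'policy_type' => $policyType,
        'subject_external_id' => $externalId,
        'odata_type' => $metaJsonb['odata_type'] ?? null,
        'etag' => $metaJsonb['etag'] ?? null,
        'scope_tag_ids' => $metaJsonb['scope_tag_ids'] ?? [],
        'assignment_target_count' => $metaJsonb['assignment_target_count'] ?? null,
    ];
}

// Made-up policy state before and after an admin edits one setting.
$meta = [
    'odata_type' => '#microsoft.graph.deviceManagementConfigurationPolicy',
    'etag' => '"abc"', // assumption: unchanged by the settings edit
    'scope_tag_ids' => ['0'],
    'assignment_target_count' => 3,
];
$before = ['meta' => $meta, 'settings' => ['wifi_psk' => 'old-secret']];
$after = ['meta' => $meta, 'settings' => ['wifi_psk' => 'new-secret']];

$metaHashBefore = hashNormalized(buildMetaContract('settingsCatalogPolicy', 'policy-1', $before['meta']));
$metaHashAfter = hashNormalized(buildMetaContract('settingsCatalogPolicy', 'policy-1', $after['meta']));
$contentHashBefore = hashNormalized($before['settings']);
$contentHashAfter = hashNormalized($after['settings']);

echo 'meta drift: ', $metaHashBefore === $metaHashAfter ? 'no' : 'yes', PHP_EOL;
echo 'content drift: ', $contentHashBefore === $contentHashAfter ? 'no' : 'yes', PHP_EOL;
```

The meta hashes match because the edited setting never enters the contract, while a hash over the settings themselves differs. That is the gap a content-fidelity provider is meant to close.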
`computeDrift()` (line 435-500) then compares `baseline_hash` vs `current_hash`:

```php
if ($baselineItem['baseline_hash'] !== $currentItem['current_hash']) {
    $drift[] = ['change_type' => 'different_version', /* ... */];
}
```

**If the admin changed a policy setting but the meta signals (etag, scope_tag_ids, assignment_target_count) stayed the same, `baseline_hash === current_hash` and NO drift is detected.**

### Why etag is unreliable

Microsoft Graph etag behavior varies by resource type:

- **Some types** update etag on any property change (including settings)
- **Some types** update etag only on top-level property changes (not nested settings)
- **Settings Catalog policies** may or may not update the parent resource etag when child `settings` are modified (the settings are a separate subresource at `/configurationPolicies/{id}/settings`)
- **Group Policy Configurations** have settings in `definitionValues` → `presentationValues` (multi-level nesting); etag at root level may not reflect these changes

### The Contrast: How Backup Drift _Does_ Detect Settings Changes

`DriftFindingGenerator::generate()` (line 32-80) operates on `PolicyVersion.snapshot`, the full JSON captured via per-item GET:

```php
$baselineSnapshot = $baselineVersion->snapshot; // Full JSON from Graph GET
$currentSnapshot = $currentVersion->snapshot;

$baselineNormalized = $this->settingsNormalizer->normalizeForDiff($baselineSnapshot, $policyType, $platform);
$currentNormalized = $this->settingsNormalizer->normalizeForDiff($currentSnapshot, $policyType, $platform);

$baselineSnapshotHash = $this->hasher->hashNormalized($baselineNormalized);
$currentSnapshotHash = $this->hasher->hashNormalized($currentNormalized);

if ($baselineSnapshotHash !== $currentSnapshotHash) {
    // → Drift finding with change_type = 'modified'
}
```

This pipeline captures actual settings values, normalizes them per policy type, strips volatile metadata, and hashes the result. If a setting changed, the hash changes, and drift is detected.

### Summary Visualization

```
Admin changes Wi-Fi password in Intune
             │
             ▼
┌─────────────────────────────────┐
│ Graph LIST (inventory sync)     │
│ returns: displayName, etag, ... │
│                                 │
│ etag MAY change, settings NOT   │
│ returned by LIST endpoint       │
└────────────┬────────────────────┘
             │
     ┌───────┴────────┐
     ▼                ▼
InventoryItem    PolicyVersion
(meta only)      (if backup ran)
     │                │
     ▼                ▼
Meta Contract    Full Snapshot
hash unchanged   hash CHANGED
     │                │
     ▼                ▼
Baseline         Backup Drift:
Compare:         "modified" ✅
NO DRIFT ❌
```

---

## 5. Code Evidence Table

| # | Class / File | Lines | Role | Key Finding |
|---|---|---|---|---|
| 1 | `app/Jobs/CompareBaselineToTenantJob.php` | 785 total; L367-409 (loadCurrentInventory), L435-500 (computeDrift) | Core baseline compare job | Reads from `InventoryItem` only; hashes via `InventoryMetaContract` → **blind to settings** |
| 2 | `app/Services/Baselines/InventoryMetaContract.php` | 75 total; L30-57 (build) | Meta hash contract builder | Hashes only: version, policy_type, external_id, odata_type, etag, scope_tag_ids, assignment_target_count; **no settings content** |
| 3 | `app/Services/Baselines/BaselineSnapshotIdentity.php` | 73 total; L56-67 (hashItemContent) | Per-item hash via meta contract | Delegates to `InventoryMetaContract.build()` → `DriftHasher.hashNormalized()` |
| 4 | `app/Jobs/CaptureBaselineSnapshotJob.php` | 305 total | Captures snapshot from inventory | Reads `InventoryItem`, stores `fidelity='meta'` and `source='inventory'` |
| 5 | `app/Services/Drift/DriftFindingGenerator.php` | 484 total; L32-80 (generate), L250-267 (recurrenceKey) | Backup drift finding generator | Uses `PolicyVersion.snapshot` with `SettingsNormalizer` → **detects settings changes** |
| 6 | `app/Services/Drift/DriftHasher.php` | 100 total; L13-24 (hashNormalized) | Shared hasher | `sha256(json_encode(normalized))` with volatile key removal. **SHARED by both systems.** |
| 7 | `app/Services/Drift/Normalizers/SettingsNormalizer.php` | 22 total | Thin wrapper | Delegates to `PolicyNormalizer::flattenForDiff()`. Used by System B only. |
| 8 | `app/Services/Intune/PolicyNormalizer.php` | 67 total | Type-specific normalizer router | Routes to per-type normalizers for diff operations |
| 9 | `app/Services/Inventory/InventorySyncService.php` | 652 total; L340-450 (executeSelectionUnderLock) | LIST-based sync | Fetches from Graph LIST endpoints; extracts meta signals only; upserts `InventoryItem` |
| 10 | `app/Services/Intune/BackupService.php` | 438 total | Backup orchestration | Creates `BackupSet`, uses `PolicyCaptureOrchestrator` for per-item GET → PolicyVersion |
| 11 | `app/Services/Intune/PolicyCaptureOrchestrator.php` | 429 total | Per-item GET + hydration | Fetches full snapshot, assignments, scope tags; creates PolicyVersion with all content |
| 12 | `app/Services/Intune/PolicySnapshotService.php` | 852 total | Per-item Graph GET | Type-specific hydration (`hydrateConfigurationPolicySettings`, `hydrateGroupPolicyConfiguration`, etc.) |
| 13 | `app/Services/Intune/VersionService.php` | 312 total; L1-150 (captureVersion) | PolicyVersion persistence | Transactional, locking, consecutive version_number |
| 14 | `app/Models/PolicyVersion.php` | Model | PolicyVersion model | Casts: snapshot(array), assignments(array), scope_tags(array), plus hash columns |
| 15 | `app/Models/InventoryItem.php` | Model | Inventory item model | Casts: meta_jsonb(array); **no settings content** |
| 16 | `app/Models/BaselineSnapshotItem.php` | Model | Snapshot item model | Has `baseline_hash(64)`, `meta_jsonb` |
| 17 | `app/Support/Inventory/InventoryCoverage.php` | 173 total | Coverage parser | `fromContext()` extracts per-type status from sync run context |
| 18 | `app/Services/Drift/DriftRunSelector.php` | ~60 total | Run pair selector | Selects 2 most recent sync runs with same `selection_hash` (System B only) |
| 19 | `app/Jobs/GenerateDriftFindingsJob.php` | ~200 total | Dispatcher for System B | Dispatches `DriftFindingGenerator` for policy-version-based drift |
| 20 | `config/graph_contracts.php` | 867 total | Policy type registry | Defines endpoints, hydration strategies, subresources, type families per policy type |
| 21 | `config/tenantpilot.php` | 385 total; L18-293 (supported_policy_types) | Application config | 28 supported policy types + 3 foundation types |
| 22 | `specs/116-baseline-drift-engine/spec.md` | 237 total | Feature spec | Defines v1 (meta) and v2 (content fidelity) requirements |
| 23 | `specs/116-baseline-drift-engine/research.md` | 200 total | Phase 0 research | 6 key decisions including v2 architecture strategy |
| 24 | `specs/116-baseline-drift-engine/plan.md` | 259 total | Implementation plan | Steps 1-7 for v1; v2 deferred |

---

## 6. Type Coverage Matrix

Coverage assessment for deep-drift feasibility: **which types have per-type normalization and hydration support?**

| # | `policy_type` | Label | Hydration | Subresources | Per-Type Normalizer | Deep-Drift Feasible | Notes |
|---|---|---|---|---|---|---|---|
| 1 | `settingsCatalogPolicy` | Settings Catalog Policy | `configurationPolicies` | `settings` (list) | Yes (via PolicyNormalizer) | **YES** | Most impactful: complex nested settings |
| 2 | `endpointSecurityPolicy` | Endpoint Security Policies | `configurationPolicies` | `settings` (list) | Yes (shared with settings catalog) | **YES** | Same endpoint family as settings catalog |
| 3 | `securityBaselinePolicy` | Security Baselines | `configurationPolicies` | `settings` (list) | Yes (shared) | **YES** | Same pipeline |
| 4 | `groupPolicyConfiguration` | Administrative Templates | `groupPolicyConfigurations` | `definitionValues` → `presentationValues` | Yes (via PolicyNormalizer) | **YES** | Multi-level nesting; hydration required |
| 5 | `deviceConfiguration` | Device Configuration | `deviceConfigurations` | None (properties on root) | Yes (via PolicyNormalizer) | **YES** | Properties directly on resource |
| 6 | `deviceCompliancePolicy` | Device Compliance | `deviceCompliancePolicies` | `scheduledActionsForRule` (expand) | Yes (via PolicyNormalizer) | **YES** | Actions subresource needs expand |
| 7 | `windowsUpdateRing` | Software Update Ring | `deviceConfigurations` (filtered) | None (properties on root) | Yes (shared with deviceConfig) | **YES** | Subset of deviceConfiguration |
| 8 | `appProtectionPolicy` | App Protection (MAM) | `managedAppPolicies` | None (properties) | Partial (via PolicyNormalizer) | **YES** | Mobile-focused |
| 9 | `conditionalAccessPolicy` | Conditional Access | `identity/conditionalAccess/policies` | None (properties) | Yes (via PolicyNormalizer) | **YES** | High-risk, preview-only restore |
| 10 | `deviceManagementScript` | PowerShell Scripts | `deviceManagementScripts` | None (scriptContent base64) | Partial | **PARTIAL** | Script content is base64 in snapshot |
| 11 | `deviceShellScript` | macOS Shell Scripts | `deviceShellScripts` | None (scriptContent base64) | Partial | **PARTIAL** | Same pattern as PS scripts |
| 12 | `deviceHealthScript` | Proactive Remediations | `deviceHealthScripts` | None | Partial | **PARTIAL** | Detection + remediation scripts |
| 13 | `deviceComplianceScript` | Custom Compliance Scripts | `deviceComplianceScripts` | None | Partial | **PARTIAL** | Script content |
| 14 | `windowsFeatureUpdateProfile` | Feature Updates | `windowsFeatureUpdateProfiles` | None | Yes | **YES** | Simple properties |
| 15 | `windowsQualityUpdateProfile` | Quality Updates | `windowsQualityUpdateProfiles` | None | Yes | **YES** | Simple properties |
| 16 | `windowsDriverUpdateProfile` | Driver Updates | `windowsDriverUpdateProfiles` | None | Yes | **YES** | Simple properties |
| 17 | `mamAppConfiguration` | App Config (MAM) | `targetedManagedAppConfigurations` | None | Partial | **YES** | Properties-based |
| 18 | `managedDeviceAppConfiguration` | App Config (Device) | `mobileAppConfigurations` | None | Partial | **YES** | Properties-based |
| 19 | `windowsAutopilotDeploymentProfile` | Autopilot Profiles | `windowsAutopilotDeploymentProfiles` | None | Minimal | **YES** | Properties-based |
| 20 | `windowsEnrollmentStatusPage` | Enrollment Status Page | `deviceEnrollmentConfigurations` | None | Minimal | **META-ONLY** | Enrollment types have limited settings |
| 21 | `deviceEnrollmentLimitConfiguration` | Enrollment Limits | `deviceEnrollmentConfigurations` | None | Minimal | **META-ONLY** | Numeric limit only |
| 22 | `deviceEnrollmentPlatformRestrictionsConfiguration` | Platform Restrictions | `deviceEnrollmentConfigurations` | None | Minimal | **META-ONLY** | Nested restriction config |
| 23 | `deviceEnrollmentNotificationConfiguration` | Enrollment Notifications | `deviceEnrollmentConfigurations` | None | Minimal | **META-ONLY** | Template snapshots nested |
| 24 | `enrollmentRestriction` | Enrollment Restrictions | `deviceEnrollmentConfigurations` | None | Minimal | **META-ONLY** | Mixed config type |
| 25 | `termsAndConditions` | Terms & Conditions | `termsAndConditions` | None | Yes | **YES** | bodyText, acceptanceStatement |
| 26 | `endpointSecurityIntent` | Endpoint Security Intents | `intents` | categories/settings (legacy) | Partial | **PARTIAL** | Legacy intent API; migrating to configPolicies |
| 27 | `mobileApp` | Applications | `mobileApps` | None | Minimal | **META-ONLY** | Metadata-only backup per config |
| 28 | `policySet` | Policy Sets | (if supported) | assignments | Minimal | **META-ONLY** | Container for other policies |

**Foundation Types:**

| # | `foundation_type` | Label | Deep-Drift | Notes |
|---|---|---|---|---|
| F1 | `assignmentFilter` | Assignment Filter | **YES** | `rule` property is key content |
| F2 | `roleScopeTag` | Scope Tag | **META-ONLY** | displayName + description only |
| F3 | `notificationMessageTemplate` | Notification Template | **PARTIAL** | Localized messages are subresource |

**Summary:**

- **Full content-fidelity feasible**: 16 policy types (settingsCatalog, endpointSecurity, securityBaseline, groupPolicy, deviceConfig, compliance, updateRings/profiles, appProtection, conditionalAccess, appConfigs, autopilot, termsAndConditions) plus the `assignmentFilter` foundation type
- **Partial** (script content / legacy APIs): 5 policy types plus the `notificationMessageTemplate` foundation type
- **Meta-only sufficient**: 7 policy types (enrollment configs, mobileApp, policySet) plus the `roleScopeTag` foundation type

---

## 7. Proposal: Deep Drift Implementation Plan

### Phase v1.5: Provider Chain (Opportunistic Content Fidelity)

**Goal**: Enable baseline compare to use existing PolicyVersions for content-fidelity hash when available, with meta-fidelity fallback.
**Estimated effort**: 3-5 days #### Step 1: ContentHashProvider Interface ```php // app/Contracts/Baselines/ContentHashProvider.php interface ContentHashProvider { /** * @return array{hash: string, fidelity: string, source: string}|null */ public function resolve(string $policyType, string $externalId, int $tenantId, CarbonImmutable $since): ?array; } ``` #### Step 2: PolicyVersionContentProvider ```php // app/Services/Baselines/PolicyVersionContentProvider.php // Looks up the latest PolicyVersion for (tenant_id, external_id, policy_type) // captured_at >= $since (baseline snapshot timestamp) // Returns SettingsNormalizer → DriftHasher hash with fidelity='content' ``` #### Step 3: MetaFallbackProvider (existing logic) ```php // Wraps InventoryMetaContract → DriftHasher → fidelity='meta' ``` #### Step 4: ContentProviderChain ```php // Iterates [PolicyVersionContentProvider, MetaFallbackProvider] // Returns first non-null result ``` #### Step 5: Integration in CompareBaselineToTenantJob - `loadCurrentInventory()` accepts optional `ContentProviderChain` - For each item: try chain, record fidelity + source - `computeDrift()` unchanged (still hash vs hash comparison) - Finding evidence includes `fidelity` and `content_hash_source` #### Step 6: CaptureBaselineSnapshotJob enhancement - Optional: during capture, also try `PolicyVersionContentProvider` to store content-fidelity baseline_hash - Store `content_hash_source` in `baseline_snapshot_items.meta_jsonb` - This means: if a backup was taken before baseline capture, the baseline itself is content-fidelity #### Step 7: Coverage extension - Add `content_coverage` to compare run context: which types had PolicyVersions, which fell back to meta - Display in operation detail UI #### Migration ```sql -- Optional: add column for source tracking ALTER TABLE baseline_snapshot_items ADD COLUMN content_hash_source VARCHAR(255) NULL DEFAULT 'inventory_meta_v1'; ``` ### Phase v2.0 — On-Demand Content Capture (Future) **Goal**: For 
types without recent PolicyVersions, perform targeted per-item GET during baseline capture/compare. **Estimated effort**: 5-8 days - Introduce `BaselineContentCaptureJob` that, for a given baseline profile's scope, identifies items lacking recent PolicyVersions and performs targeted GET + PolicyVersion creation. - Reuses existing `PolicyCaptureOrchestrator` with a new "baseline-triggered" context. - Adds `capture_mode` to baseline profile: `meta_only` (v1), `opportunistic` (v1.5), `full_content` (v2.0). - Rate limiting: per-tenant throttle to avoid Graph API quota issues. - Budget guard: max N items per capture run, with continuation support. ### Phase v2.5 — Inventory Content Enrichment (Future, Optional) **Goal**: Optionally have inventory sync capture settings content inline during LIST (where type supports `$expand`). - Some types support `$expand=settings` on LIST (settings catalog, endpoint security). - This would give "free" content fidelity without per-item GET. - High complexity: varies per type, may increase LIST payload size significantly. - Evaluate ROI after v2.0 ships. --- ## 8. 
Test Plan (Enterprise)

### Unit Tests

| # | Test File | Scope | Key Assertions |
|---|---|---|---|
| U1 | `tests/Unit/Baselines/ContentProviderChainTest.php` | Provider chain resolution | First provider wins; null fallback; fidelity recorded correctly |
| U2 | `tests/Unit/Baselines/PolicyVersionContentProviderTest.php` | PolicyVersion lookup + normalization | Correct hash for known snapshot; returns null when no PV; respects `$since` cutoff |
| U3 | `tests/Unit/Baselines/MetaFallbackProviderTest.php` | Meta contract fallback | Produces `fidelity='meta'`; matches existing `InventoryMetaContract` behavior exactly |
| U4 | `tests/Unit/Baselines/InventoryMetaContractTest.php` | (Existing) contract stability | Null handling, ordering, versioning; extend for edge cases |

### Feature Tests

| # | Test File | Scope | Key Assertions |
|---|---|---|---|
| F1 | `tests/Feature/Baselines/BaselineCompareContentFidelityTest.php` | End-to-end compare with PolicyVersions available | Settings change → `different_version` finding with `fidelity='content'` |
| F2 | `tests/Feature/Baselines/BaselineCompareMixedFidelityTest.php` | Some types have PV, some don't | Mixed `fidelity` values in findings; coverage context records both |
| F3 | `tests/Feature/Baselines/BaselineCompareFallbackTest.php` | No PolicyVersions available | Falls back to meta fidelity; identical behavior to v1 |
| F4 | `tests/Feature/Baselines/BaselineCaptureFidelityTest.php` | Capture with PolicyVersions present | `baseline_hash` uses content fidelity; `content_hash_source` recorded |
| F5 | `tests/Feature/Baselines/BaselineCompareStaleVersionTest.php` | PolicyVersion older than snapshot | Falls back to meta (stale PV not used) |
| F6 | `tests/Feature/Baselines/BaselineCompareCoverageGuardContentTest.php` | Coverage reporting for content types | `content_coverage` in run context shows which types are content-covered |

### Existing Tests to Preserve

| # | Test File | Impact |
|---|---|---|
| E1 | `tests/Feature/Baselines/BaselineCompareFindingsTest.php` | Must still pass: meta fidelity is default when no PV exists |
| E2 | `tests/Feature/Baselines/BaselineComparePreconditionsTest.php` | No change expected |
| E3 | `tests/Feature/Baselines/BaselineCompareStatsTest.php` | Stats remain grouped by scope_key; may need fidelity breakdown |
| E4 | `tests/Feature/Baselines/BaselineOperabilityAutoCloseTest.php` | Auto-close unaffected by fidelity source |

### Integration / Regression

| # | Test | Scope |
|---|---|---|
| I1 | Content hash stability across serialization | JSON encode/decode round-trip does not change hash |
| I2 | PolicyVersion normalizer alignment | Same snapshot → `SettingsNormalizer` produces same hash in both System A (via provider) and System B (via DriftFindingGenerator) |
| I3 | Hash collision protection | Different settings → different hashes (property-based test with sample data) |
| I4 | Empty snapshot edge case | PolicyVersion with empty/null snapshot → provider returns null → fallback works |

### Performance Tests

| # | Test | Acceptance Criteria |
|---|---|---|
| P1 | Compare job with 500 items, 50% with PolicyVersions | Completes in < 30s (DB-only, no Graph calls) |
| P2 | Provider chain query efficiency | PolicyVersion lookup uses batch query, not N+1 |

---

## 9. Open Questions / Assumptions

### Open Questions

| # | Question | Impact | Proposed Resolution |
|---|---|---|---|
| OQ-1 | **Staleness threshold for PolicyVersions**: How old can a PolicyVersion be before we reject it as a content source? | Determines false-negative risk | Default: PolicyVersion must be captured after the baseline snapshot's `captured_at`. Configurable per workspace. |
| OQ-2 | **Mixed fidelity UX**: How should the UI display findings with different fidelity levels? | User trust and understanding | Badge/icon on finding cards: "High confidence (content)" vs "Structural only (meta)". Filterable in findings table. |
| OQ-3 | **Should baseline capture _force_ a backup** if no recent PolicyVersions exist? | API cost vs accuracy trade-off | No for v1.5 (opportunistic only). Yes for v2.0 as opt-in `capture_mode: full_content`. |
| OQ-4 | **etag as change hint**: Should we use etag changes as a _trigger_ for on-demand PolicyVersion capture? | Could reduce unnecessary GETs | Worth investigating in v2.0. If etag changes during inventory sync, schedule targeted per-item GET for that policy only. |
| OQ-5 | **Settings Catalog `$expand=settings`** on LIST: Does Microsoft Graph support this? | Could give "free" content fidelity for settings catalog types | Needs validation against Graph API. If supported, would eliminate per-item GET for the most impactful type. |
| OQ-6 | **Retention / pruning interaction**: If old PolicyVersions are pruned, does that affect baseline compare? | Could lose content fidelity for old baselines | Baseline compare only needs versions captured _after_ baseline snapshot. Pruning policy should respect active baseline snapshots. |

### Assumptions

| # | Assumption | Risk if Wrong |
|---|---|---|
| A-1 | `DriftHasher::hashNormalized()` is deterministic across PHP serialization boundaries | Hash mismatch → false drift findings. **Validated**: uses `json_encode` with stable flags + `ksort`. |
| A-2 | `SettingsNormalizer` / `PolicyNormalizer` produce the same output for the same input regardless of call context (System A vs System B) | Hash inconsistency between systems. **Low risk**: same code path. |
| A-3 | PolicyVersions from backups contain complete settings (not partial hydration) | Incomplete content → false negatives or incorrect hashes. **Validated**: `PolicySnapshotService` performs full hydration per type. |
| A-4 | The `Finding` model's `fingerprint`/`recurrence_key` identity allows mixed fidelity sources | Identity collision if fidelity changes source. **Safe**: recurrence_key includes snapshot_id, not hash value. |
| A-5 | Graph LIST endpoints do NOT return settings values for any supported policy type | If wrong, inventory sync could capture settings "for free". **Validated**: LIST returns only `$select` fields per `graph_contracts.php`. |
| A-6 | Per-type normalizers in backup drift path handle all 28 supported policy types | If not, some types would produce unstable hashes. **Partially validated**: `PolicyNormalizer` has a fallback for unknown types. |

---

## 10. Key Questions Answered

### KQ-01: Are Baseline Compare and Backup Drift truly separate systems?

**Yes.** They share `DriftHasher` and the `Finding` model, but differ in:

- Data source: `InventoryItem` vs `PolicyVersion`
- Hash contract: `InventoryMetaContract` (7 fields, meta only) vs `SettingsNormalizer → PolicyNormalizer` (full snapshot)
- Finding generator: `CompareBaselineToTenantJob::computeDrift()` vs `DriftFindingGenerator::generate()`
- Finding identity: different recurrence key structures
- Scope model: `BaselineProfile`-scoped vs `selection_hash`-scoped
- Trigger: post-inventory-sync vs post-backup
- Coverage: `InventoryCoverage` guard vs none (trusts backup completeness)

### KQ-02: Should they be unified or remain separate?

**Hybrid approach (Provider Chain)**, as designed in Spec 116 v2. Keep separate triggering and scoping, but let System A _consume_ data produced by System B (PolicyVersions) via a provider chain. This avoids:

- Merging two fundamentally different scoping models
- Introducing new Graph API costs
- Disrupting existing backup drift workflows

### KQ-03: What is the minimal viable "v1.5" to bridge the gap?

Add a `PolicyVersionContentProvider` that checks for recent PolicyVersions as part of baseline compare's hash computation. For types where a PolicyVersion exists (i.e., a backup was taken), the compare immediately gains content-fidelity. For types without, meta-fidelity continues as before.

**Net code change: ~200-300 lines** (interface + 2 providers + chain + integration).
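
The provider chain described in KQ-03 (and exercised by tests U1, U2 and F5) can be illustrated with a small, language-neutral sketch. It is written in Python for brevity and is not the project's PHP implementation: `ResolvedHash`, `policy_version_provider`, `meta_fallback_provider` and `resolve_hash` are hypothetical names, the in-memory dictionaries stand in for database lookups, and treating a PolicyVersion captured exactly at the cutoff as stale is an assumption drawn from the "captured after" wording in OQ-1.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class ResolvedHash:
    hash: str
    fidelity: str             # 'content' or 'meta'
    content_hash_source: str  # e.g. 'policy_version:42' or 'inventory_meta_v1'


# A provider takes (policy_type, external_id, since) and returns a
# ResolvedHash, or None when it has nothing usable for that subject.
Provider = Callable[[str, str, datetime], Optional[ResolvedHash]]


def policy_version_provider(versions: dict) -> Provider:
    """Hypothetical content provider backed by an in-memory index.

    `versions` maps (policy_type, external_id) to a tuple of
    (version_id, captured_at, content_hash).
    """
    def resolve(policy_type: str, external_id: str, since: datetime):
        row = versions.get((policy_type, external_id))
        if row is None:
            return None  # no PolicyVersion for this subject
        version_id, captured_at, content_hash = row
        if captured_at <= since:
            return None  # stale: not captured after the baseline snapshot
        return ResolvedHash(content_hash, "content", f"policy_version:{version_id}")
    return resolve


def meta_fallback_provider(meta_hashes: dict) -> Provider:
    """Hypothetical fallback that serves the existing meta-contract hash."""
    def resolve(policy_type: str, external_id: str, since: datetime):
        meta_hash = meta_hashes.get((policy_type, external_id))
        if meta_hash is None:
            return None
        return ResolvedHash(meta_hash, "meta", "inventory_meta_v1")
    return resolve


def resolve_hash(providers: Sequence[Provider], policy_type: str,
                 external_id: str, since: datetime) -> Optional[ResolvedHash]:
    """Return the first non-None result; provider order encodes precedence."""
    for provider in providers:
        result = provider(policy_type, external_id, since)
        if result is not None:
            return result
    return None
```

Placing the PolicyVersion provider first and the meta fallback last reproduces the "first provider wins" behavior: a subject with a fresh PolicyVersion resolves with `fidelity='content'`, while a subject with a stale or missing one falls through to `fidelity='meta'`.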
### KQ-04: Which types benefit most from content-fidelity drift?

**Top priority** (complex settings, high change frequency):

1. `settingsCatalogPolicy`: most common, deeply nested settings
2. `groupPolicyConfiguration`: multi-level nesting (definitionValues → presentationValues)
3. `deviceCompliancePolicy`: compliance rules + scheduled actions
4. `deviceConfiguration`: broad category, many OData sub-types
5. `endpointSecurityPolicy`: critical security settings
6. `securityBaselinePolicy`: security-critical baselines
7. `conditionalAccessPolicy`: identity security gate

**Medium priority** (simpler settings but still valuable):

8. `appProtectionPolicy`, `windowsUpdateRing`, `windowsFeatureUpdateProfile`, `windowsQualityUpdateProfile`

### KQ-05: How does coverage work and how should it extend for content fidelity?

Currently: `InventoryCoverage::fromContext(latestSyncRun->context)` → `coveredTypes()` returns types with `status=succeeded`. Uncovered types → findings suppressed, outcome = `partially_succeeded`.

For v1.5: Add `content_coverage` alongside `meta_coverage`:

- `content_covered_types`: types where PolicyVersion exists post-baseline
- `meta_only_types`: types where only meta is available
- `uncovered_types`: types with no coverage at all (findings suppressed)

Finding evidence should include:

```json
{
  "fidelity": "content",
  "content_hash_source": "policy_version:42",
  "note": "Hash computed from PolicyVersion #42 captured 2025-07-14T10:30:00Z"
}
```

### KQ-06: What is the long-term unified architecture?

**Provider precedence chain** with configurable capture modes:

```
BaselineProfile.capture_mode:
  'meta_only'      → InventoryMetaContract only (v1)
  'opportunistic'  → PolicyVersion if available → meta fallback (v1.5)
  'full_content'   → On-demand GET for missing types → PolicyVersion → meta (v2.0)

ContentProviderChain:
  1. PolicyVersionContentProvider (checks existing PolicyVersions)
  2. InventoryContentProvider (future: if inventory sync enriched)
  3. MetaFallbackProvider (InventoryMetaContract v1)
```

The long-term vision is that baseline capture + compare use the **same normalizer pipeline** as backup drift, producing identical hashes for identical content regardless of which system produced the PolicyVersion. This is achievable because `DriftHasher` and `SettingsNormalizer` are already shared code.

---

## Appendix: Database Schema Reference

### `baseline_snapshot_items` (current)

```
id                    BIGINT PK
baseline_snapshot_id  BIGINT FK → baseline_snapshots
subject_type          VARCHAR(255)  -- 'policy'
subject_external_id   VARCHAR(255)  -- Graph resource GUID
policy_type           VARCHAR(255)  -- e.g. 'settingsCatalogPolicy'
baseline_hash         VARCHAR(64)   -- sha256 of InventoryMetaContract
meta_jsonb            JSONB         -- {display_name, category, platform, meta_contract: {...}, fidelity, source}
created_at            TIMESTAMP
updated_at            TIMESTAMP
```

### `inventory_items` (current)

```
id                          BIGINT PK
tenant_id                   BIGINT FK → tenants
policy_type                 VARCHAR(255)
external_id                 VARCHAR(255)
display_name                VARCHAR(255)
category                    VARCHAR(255) NULL
platform                    VARCHAR(255) NULL
meta_jsonb                  JSONB  -- {odata_type, etag, scope_tag_ids, assignment_target_count}
last_seen_at                TIMESTAMP NULL
last_seen_operation_run_id  BIGINT NULL
created_at                  TIMESTAMP
updated_at                  TIMESTAMP
```

### `policy_versions` (current)

```
id                BIGINT PK
tenant_id         BIGINT FK → tenants
policy_id         BIGINT FK → policies
version_number    INTEGER
policy_type       VARCHAR(255)
platform          VARCHAR(255) NULL
created_by        VARCHAR(255) NULL
captured_at       TIMESTAMP
snapshot          JSON          -- FULL Graph GET response (hydrated)
metadata          JSON          -- additional metadata
assignments       JSON NULL     -- full assignments array
scope_tags        JSON NULL     -- scope tag IDs
assignments_hash  VARCHAR(64) NULL
scope_tags_hash   VARCHAR(64) NULL
created_at        TIMESTAMP
updated_at        TIMESTAMP
deleted_at        TIMESTAMP NULL  -- soft delete
```

### Proposed v1.5 Addition

```sql
ALTER TABLE baseline_snapshot_items
    ADD COLUMN content_hash_source VARCHAR(255) NULL DEFAULT 'inventory_meta_v1';
-- Values: 'inventory_meta_v1', 'policy_version:{id}', 'inventory_content_v2'
```
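
As a rough illustration of how the proposed `content_hash_source` values could feed the coverage buckets from KQ-05, the sketch below maps a source string to a fidelity label and partitions policy types into `content_covered_types`, `meta_only_types` and `uncovered_types`. It is a Python sketch rather than project code: the function names `fidelity_for_source` and `classify_coverage` are hypothetical, and treating `inventory_content_v2` as content fidelity is an assumption, since that provider is only listed as a future option.

```python
from typing import Dict, Iterable, List, Set


def fidelity_for_source(content_hash_source: str) -> str:
    """Map a content_hash_source value to 'content' or 'meta'."""
    if content_hash_source == "inventory_meta_v1":
        return "meta"
    if content_hash_source == "inventory_content_v2":
        # Assumption: the future inventory content provider counts as content.
        return "content"
    prefix, sep, version_id = content_hash_source.partition(":")
    if prefix == "policy_version" and sep and version_id.isdigit():
        return "content"
    raise ValueError(f"unrecognized content_hash_source: {content_hash_source!r}")


def classify_coverage(all_types: Iterable[str],
                      meta_covered: Set[str],
                      content_covered: Set[str]) -> Dict[str, List[str]]:
    """Partition policy types into the three KQ-05 coverage buckets.

    `meta_covered` holds types the latest inventory sync covered;
    `content_covered` holds types with a PolicyVersion captured after
    the baseline snapshot. Each type lands in exactly one bucket.
    """
    buckets: Dict[str, List[str]] = {
        "content_covered_types": [],
        "meta_only_types": [],
        "uncovered_types": [],
    }
    for policy_type in sorted(set(all_types)):
        if policy_type in content_covered:
            buckets["content_covered_types"].append(policy_type)
        elif policy_type in meta_covered:
            buckets["meta_only_types"].append(policy_type)
        else:
            buckets["uncovered_types"].append(policy_type)
    return buckets
```

Rejecting unknown source strings with an error, rather than silently defaulting to meta, is a design choice in this sketch; it keeps a mistyped or future value from being reported under the wrong fidelity.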